Abstract

Conformational transitions are ubiquitous in biomolecular systems, have significant functional
roles and are subject to evolutionary pressures. Here we provide a first theoretical framework for
topological transitions, i.e. conformational transitions that are associated with changes in
molecular topology. For folded linear biomolecules, the arrangement of intramolecular contacts is
identified as a key topological property, termed circuit topology. Distance measures are proposed
as reaction coordinates to represent progress along a pathway from an initial topology to a final
topology. Certain topological classes are shown to be more accessible from a random topology. We
study the dynamic stability and pathway degeneracy associated with a topological reaction and find
that off-pathways might seriously hamper evolution to desired topologies. Finally, we present an
algorithm for estimating the number of intermediate topologies visited during a topological
reaction. The results of this study are relevant to, among others, structural studies of RNA and
proteins, analysis of topologically associated domains in chromosomes, and molecular evolution.

I. INTRODUCTION

The arrangement of the contacts in a folded linear polymer chain is a topological property of
the chain. This arrangement, called circuit topology, has been rigorously defined for linear
chains with intra-chain contacts, and its use in studying the equivalence of folded molecular
chains has been proposed [1, 2]. Molecules can vary in size and sequence and yet have
identical circuit topology. The topology framework allows for classification of chains into
topological equivalence classes. This approach is applicable to biomolecular chains ranging
from proteins to RNA to complete chromosomes, all of which are folded linear chains held
together by intra-chain contacts [3].

For two chains that do not fall within the same topological class, we lack a distance
measure to quantify their differences. Distance measures are crucial for building molecular
phylogeny based on topology and for describing topological dynamics of chains. An appropriate
metric structure allows for the construction of an evolutionary landscape, the probing
of processes such as epistasis [4], and the modeling of the formation and dynamics of
topologically associated domains in chromosomes [5, 6]. It can also serve as a reaction
coordinate for conformational reactions such as folding [7]. Development of metric structures
on a space of circuit topologies may have far-reaching implications beyond polymer science.
Similar linearly structured objects appear in areas other than the biomolecular sciences [8-10].
These structures, though very different in nature, may share generic structural and dynamical
properties.

For a simple model of a folded polymer chain, we define distance measures based on a
set of reasonable local changes in the space of contact configurations, and use a minimum
assignment algorithm to estimate one of the proposed distances. Because a topological
treatment is agnostic to length, the model can be simplified by setting the length of every
chain segment to any desired value, e.g. unity. The chain can then be readily represented
by a graph in which the nodes correspond to the contact sites and the links represent
intramolecular interactions. The simple chain model allows us to disentangle topology from other
structural features of molecules.

Despite the simplicity of our model, it is still a challenge to investigate the topological
dynamics of the chain and to identify conformational reaction pathways. Sampling from the
exponentially large space of the possible pathways could be very time consuming for
disordered and frustrated energy functionals of pathways, let alone the global constraint of
connectivity in a pathway. An efficient way of sampling from such energy landscapes in
sparsely (weakly) interacting systems is provided by the cavity method of statistical physics
[11, 12], relying on the Bethe approximation. The recursive and local nature of these equations
is exploited in approximate message-passing algorithms that have proven useful in
the study of random constraint satisfaction and optimization problems [13, 14]. In particular,
the cavity method could be very helpful when we have to deal with global or nonlocal
constraints, e.g. the connectivity constraint in the minimum-weight Steiner tree problem
[15], and in the study of stochastic optimization problems [2^], where computing
the energy function is already hard. In all of these examples, it would be computationally
expensive (if not impossible) to study large-scale problem instances with the standard
optimization algorithms based on Monte Carlo sampling.

In this article, we map the entire space of contact configurations and study how two molecular
configurations interconvert by the rearrangement of contacts. To this end, we construct
connectivity graphs of link configurations related by the local changes used in the distance
measures. Then, we employ an exhaustive search algorithm (for small systems) to
study optimal evolution with respect to an appropriate energy functional of pathways in
the configuration space. We also present an approximate message-passing algorithm for
the optimal evolution problem in larger systems. To address the connectivity constraint of
the evolution, we have to define some intermediate auxiliary variables, making the problem
amenable to local message-passing algorithms. This allows us to find reasonable approximate
solutions for the optimal paths connecting two boundary configurations.


II. DEFINITIONS 

FIG. 1. Illustrating the possible arrangements of contact pairs, and the effects of different
local changes on a link configuration. The links are ordered from left to right according to
their first endpoint. Two links can be in parallel (p), series (s), and cross (x) states. The
left panels show how a local change of type I (left-middle) and II (left-bottom) affects links 1
and 3. The right panel shows: (right-top) a local change of type $M^-$, where a contact is
removed from the system, (right-bottom) a local change of type $M^+$, where a contact is created,
and (right-middle) a local change of type $M^* = M^+ M^-$, which is a combination of the above
local changes.

Our graph representation of a chain with $M$ contacts includes $M$ links with endpoints
$e_l = (i_l, j_l)$ labeled by $l = 1, \dots, M$, where $i_l, j_l \in \{1, \dots, 2M\}$ (Figure 1).
Here, a link configuration is defined by $L = \{e_l \mid l = 1, \dots, M\}$, where each endpoint
takes part in one and only one contact. Note that $i_l \neq j_l$ and links are not directed,
$(i_l, j_l) = (j_l, i_l)$. Any two links are in one of three states: parallel (p), series (s),
or cross (x) with respect to the backbone chain. The chain is directed from left to right,
$\{1, 2, \dots, 2M-1, 2M\}$. We are interested in topologically different link configurations
represented by an $M \times M$ matrix $A$ with elements $A_{l,l'} \in \{p, s, x\}$. For
simplicity, here we do not care if such a configuration has a physical three-dimensional
realization.
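The three pair states can be made concrete with a short sketch. It assumes the usual circuit-topology reading of the definitions above: two links are in series when their endpoint intervals are disjoint, in parallel when one interval is nested inside the other, and crossed when the intervals interleave. The function names `pair_state` and `state_matrix` are illustrative and do not come from the paper.

```python
def pair_state(link_a, link_b):
    """Return 'p', 's' or 'x' for two links given as pairs of distinct endpoints."""
    # Links are undirected, so sort each pair, then order the two links
    # by their first endpoint: afterwards i1 < j1, i2 < j2 and i1 < i2.
    (i1, j1), (i2, j2) = sorted(tuple(sorted(link)) for link in (link_a, link_b))
    if j1 < i2:
        return "s"  # series: the first link closes before the second opens
    if j2 < j1:
        return "p"  # parallel: the second link is nested inside the first
    return "x"      # cross: the two links interleave


def state_matrix(links):
    """Build the M x M matrix A of pair states (diagonal entries are None)."""
    size = len(links)
    return [[pair_state(links[a], links[b]) if a != b else None
             for b in range(size)]
            for a in range(size)]


if __name__ == "__main__":
    links = [(1, 4), (2, 3), (5, 6)]
    for row in state_matrix(links):
        print(row)
```

In this example the first two links are nested (p) and the third is in series (s) with both, so the configuration carries $N_p = 1$, $N_s = 2$, $N_x = 0$.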

A link configuration can also be represented by a perfect matching of the $2M$ endpoints
$i = 1, \dots, 2M$. Here, the configuration is identified by a set of connectivity variables
$C = \{c_{ij} \in \{0, 1\} \mid i < j\}$ showing the absence or presence of connections between
the endpoints. Each perfect matching defines a class of topologically equivalent link
configurations related by a permutation of the link labels. The number of such perfect matchings
is $(2M-1)!! = (2M-1) \times (2M-3) \times \cdots \times 1$, and for each one there are $M!$
ways of labeling the links. In other words, there are $M!$ matrices $A$ representing the set of
topologically equivalent link configurations. Given a link configuration, one can easily
construct the unique matrix $A$; the endpoints $e_l = (i_l, j_l)$ and
$e_{l'} = (i_{l'}, j_{l'})$ are enough to identify the element
$A_{l,l'}(e_l, e_{l'}) \in \{p, s, x\}$.
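The count $(2M-1)!!$ is easy to check by brute force for small $M$. The following sketch enumerates all perfect matchings of the $2M$ endpoints recursively (always pairing the leftmost free endpoint with one of the remaining ones) and compares the count with the double factorial. The helper names are illustrative only.

```python
def perfect_matchings(points):
    """Yield every perfect matching of an even-length list of endpoints."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    # Pair the leftmost free endpoint with each possible partner in turn.
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def double_factorial_odd(num_links):
    """Return (2M - 1)!! = (2M - 1)(2M - 3)...1 for M = num_links."""
    result = 1
    for k in range(1, 2 * num_links, 2):
        result *= k
    return result


if __name__ == "__main__":
    for num_links in range(1, 5):
        endpoints = list(range(1, 2 * num_links + 1))
        count = sum(1 for _ in perfect_matchings(endpoints))
        print(num_links, count, double_factorial_odd(num_links))
```

For $M = 1, \dots, 4$ the counts are 1, 3, 15 and 105, which add up to 124 configurations with at most four links.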

A link configuration of $M$ links has $N = M(M-1)/2$ pairs of links that can be partitioned
into three disjoint subsets of size $N_p, N_s, N_x$ depending on their relative state $p, s, x$.
The links can have an arbitrary labeling, but a convenient one that we will use later is the
one in which the links are labeled from left to right according to the order of their first
endpoints. At some point we will consider structured or modular link configurations with
groups of links organized in well separated communities or components. We will consider
simple modules of $m$ links with all the link pairs of type $q = p, s, x$, represented by $qm$.
For example, $q_1 m_1 q_2 m_2$ denotes a configuration with two modules of types $q_1, q_2$ and
numbers of links $m_1, m_2$, respectively. Figure 1 displays the main definitions and notations
we use in this paper.


III. DISTANCE MEASURES 

Given a link configuration, one can change the link arrangement in different ways to obtain
a nearby configuration. In this study, we will use the following local changes to construct a
connectivity graph $\mathcal{G}$ of configurations and to define appropriate distance measures
in the space of contact configurations.

(I) Consider two links $l$ and $l'$ with endpoints $e_l = (i_l, j_l)$ and
$e_{l'} = (i_{l'}, j_{l'})$, respectively. From this we construct the other two topologically
distinct configurations: $e_l = (i_l, i_{l'}), e_{l'} = (j_l, j_{l'})$ and
$e_l = (i_l, j_{l'}), e_{l'} = (i_{l'}, j_l)$, obtained by an interchange of the endpoints. We
call this a local change of type I (LC-I). This changes not only $A_{l,l'}$ but may also change
other matrix elements $A_{l,l''}$ and $A_{l',l''}$. As a result, the numbers $N_p, N_s, N_x$ may
change considerably under a LC-I.

(II) Consider two neighboring endpoints $(k, k+1)$ on the chain belonging to two different
links, say $e_l = (i_l, j_l = k)$ and $e_{l'} = (i_{l'} = k+1, j_{l'})$. Then we change the
order of the neighboring endpoints to obtain $(i_l, j_l = k+1)$ and $(i_{l'} = k, j_{l'})$, see
Fig. 1. We call this a local change of type II (LC-II). This only changes the matrix element
$A_{l,l'}$. Here, the numbers $N_p, N_s, N_x$ change smoothly (at most by one).

Starting from a link configuration we can obtain all the other ones from permutations
of the endpoints $i = 1, 2, \dots, 2M$ and the link labels $l = 1, 2, \dots, M$; there are
$2^M M!$ permutations of the endpoints and the link labels that lead to a topologically
equivalent configuration. In fact, starting from an arrangement of the endpoints we can reach
any other one by a sequence of elementary transpositions swapping two adjacent endpoints.
Thus, the space of link configurations is connected under the LC-II updates. Figure 2 shows
a small connectivity graph $\mathcal{G}_{II}$ of link configurations related by the LC-II. We
see that there is no odd loop in this graph; we always need an even number of local changes of
type II to return to a given link configuration.
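The connectivity and the absence of odd loops can be checked numerically for small $M$. The sketch below is one possible encoding, not the authors' implementation: a configuration is stored as a word of length $2M$ whose entry at position $i$ is the label of the link attached to endpoint $i$, with links relabeled in order of first appearance so that topologically equivalent labelings coincide. An LC-II then swaps two adjacent entries that carry different labels.

```python
from collections import deque


def canonical(word):
    """Relabel links 0, 1, 2, ... in order of their first endpoint."""
    relabel = {}
    return tuple(relabel.setdefault(label, len(relabel)) for label in word)


def neighbors(word):
    """Configurations one LC-II away: swap adjacent endpoints of two different links."""
    found = set()
    for k in range(len(word) - 1):
        if word[k] != word[k + 1]:
            swapped = list(word)
            swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
            found.add(canonical(swapped))
    return found


def bfs_distances(start):
    """Minimum number of LC-II moves from start to every reachable configuration."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


M = 4
all_x = canonical(list(range(M)) * 2)                          # 0 1 2 3 0 1 2 3
all_s = canonical([label for label in range(M) for _ in (0, 1)])  # 0 0 1 1 2 2 3 3
dist = bfs_distances(all_x)
# An edge joining two nodes at the same BFS depth would close an odd loop.
same_depth_edges = sum(1 for w in dist for v in neighbors(w) if dist[w] == dist[v])
print(len(dist), dist[all_s], same_depth_edges)
```

The absence of odd loops has a simple explanation in this encoding: every LC-II converts one crossed pair into a parallel or series pair, or the reverse, so $N_x$ changes by exactly one and its parity alternates along any path.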

Note that the number of links $M$ is fixed in the above local changes. But we may consider
the case in which one link can be removed from or added to the system. We say two link
configurations (with different numbers of links) are connected by a local change of type
$M^\pm$ if the larger configuration (with the larger number of links) can be obtained by adding
one link to the other configuration. The process in which one link is removed from the system
(LC-$M^-$) and is replaced with another link connecting two new endpoints (LC-$M^+$) defines
another local change (LC-$M^*$) in the subspace of fixed $M$. Note that each endpoint belongs
to one and only one link. When we add (remove) a contact we also add (remove) the two
associated endpoints. Moreover, we could also allow for different local changes to happen;
for example, two link configurations can be related by either LC-II or LC-$M^\pm$. Figure 2
shows such an extended connectivity graph of link configurations $\mathcal{G}_{II+M^\pm}$ with
$M \le 4$ links.

FIG. 2. The graph representation of link configurations (shown by nodes) for $M = 4$ (left,
105 nodes) and $M \le 4$ (right, 124 nodes). Two nodes in the left panel are connected if the
corresponding link configurations are related by a local change of type II. A connection in the
right panel means the two configurations are related by LC-II or LC-$M^\pm$. The special link
configurations are: s4 (green), p4 (blue), x4 (yellow), p2p2, p2x2, x2x2 (white), and sp3, sx3
(cyan). The bigger nodes inside the circle in the right panel show configurations with a
smaller number of links.

Based on the above local changes we define the following distance measures: 

(DI) Given the connectivity patterns $C_1, C_2$ of the endpoints in two link configurations,
we can easily compute the Hamming distance
$D_I(C_1, C_2) \propto \sum_{i<j} (1 - \delta_{c^1_{ij}, c^2_{ij}})$ of the two
configurations. Here $\delta_{c,c'} = 1$ if $c = c'$, otherwise $\delta_{c,c'} = 0$. A local
change of type I increases this distance by one, but it could result in considerable changes in
other macroscopic properties of the chain. For instance, the number of links in the shortest
path connecting the two ends of the chain (end-to-end distance) is very sensitive to a LC-I. In
other words, there are very close link configurations (according to this measure) with very
different end-to-end distances. The situation is of course better for the average shortest
distance (the minimum number of links needed to go from one endpoint to another one).

(DII) The local changes of type II suggest another distance measure $D_{II}(L_1, L_2)$ as the
minimum number of LC-IIs we need to go from configuration $L_1$ to $L_2$. Both the end-to-end
and the average shortest path distances are more correlated with this measure than the
previous one. Notice that there is no odd loop in $\mathcal{G}_{II}$; any two link
configurations connected by a path of even length cannot be connected by another path of odd
length. Therefore, all the paths connecting the same boundary configurations have lengths of
the same parity, all even or all odd. The number $\mathcal{N}(t)$ of link configurations at
distance $t$ from a reference configuration shows how the other configurations are distributed
around that configuration. Figure 3 shows the entropy
$S(t) = \ln \mathcal{N}(t) / (M \ln M)$ obtained by an exhaustive enumeration algorithm for
$M = 6$ links. It is observed that structured configurations like x3x3 (with two x3 modules)
are closer to the other configurations than the all-s (s6) and the all-p (p6) configurations.
As the figure shows, the difference becomes clearer in the extended connectivity graph, where
two link configurations are directly connected if they are related by LC-II or LC-$M^\pm$.
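The shell counts $\mathcal{N}(t)$ and the entropy $S(t)$ can be reproduced for small systems with a breadth-first search. The sketch below is a minimal illustration under an assumed encoding (a configuration is a word listing, for each endpoint along the chain, the label of its link); it only covers LC-II moves at fixed $M$, and the function names are hypothetical.

```python
import math
from collections import Counter, deque


def canonical(word):
    """Relabel links 0, 1, 2, ... in order of their first endpoint."""
    relabel = {}
    return tuple(relabel.setdefault(label, len(relabel)) for label in word)


def neighbors(word):
    """Configurations one LC-II away from the given word."""
    found = set()
    for k in range(len(word) - 1):
        if word[k] != word[k + 1]:  # the two endpoints belong to different links
            swapped = list(word)
            swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
            found.add(canonical(swapped))
    return found


def shell_sizes(reference):
    """Return a Counter mapping distance t to the number of configurations N(t)."""
    dist = {reference: 0}
    queue = deque([reference])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return Counter(dist.values())


M = 3
all_s = (0, 0, 1, 1, 2, 2)  # the s3 configuration
shells = shell_sizes(all_s)
for t in sorted(shells):
    entropy = math.log(shells[t]) / (M * math.log(M))
    print(t, shells[t], round(entropy, 3))
```

Replacing `all_s` by another reference word (for example `(0, 1, 2, 0, 1, 2)` for x3) gives the corresponding profile around that configuration.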


A. Distance of two matrix configurations 


The Hamming distance of two link configurations represented by matrices $(A^1, A^2)$ is
simply $D(A^1, A^2) = \sum_{l<l'} (1 - \delta_{A^1_{l,l'}, A^2_{k,k'}})$ for a given
assignment $k(l)$ of the link labels in the two matrices. Here, to ease the notation, we used
$k, k'$ for $k(l), k(l')$. When the links are indistinguishable and the contact labels are
irrelevant, we define the distance measure

$D_{III}(A^1, A^2) = \min_{\text{assignment}} D(A^1, A^2)$.   (1)

If two matrix configurations $A^1, A^2$ are related by a LC-II then $D_{III}(A^1, A^2) = 1$. We
stress that in contrast to the other two measures, here the link and contact labels are not
important; $D_{III}(A^1, A^2) = 0$ for any two link configurations that have the same number of
modules with similar internal structures irrespective of their ordering.






FIG. 3. The exact entropy $S(t)$ (logarithm of the number of link configurations) at distance
$t$ from some reference configurations (p6, s6, x6, p3p3, p3x3, x3x3). In the left panel, $t$
is the number of local changes of type II connecting two link configurations in the subspace of
fixed $M = 6$. In the right panel, $t$ is the number of local changes of type II and $M^\pm$
connecting two link configurations with possibly different numbers of links $M \le 6$.


Finding the above minimum distance is a minimum assignment problem: we are to find
a one-to-one assignment $l \to k$ of the link labels in the two configurations minimizing the
Hamming distance $D(A^1, A^2)$. Here, we briefly describe an approximate algorithm to find
such an assignment. More details are given in Appendix A.
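For very small $M$ the minimum in Eq. (1) can be found by trying all $M!$ assignments, which gives a reference value against which any approximate method can be compared. The sketch below does exactly that; it is a brute-force stand-in, not the message-passing algorithm described in the text, and it assumes the nested/disjoint/interleaved reading of the p, s, x states. All function names are illustrative.

```python
from itertools import permutations


def pair_state(link_a, link_b):
    """Return 'p' (nested), 's' (disjoint) or 'x' (interleaved) for two links."""
    (i1, j1), (i2, j2) = sorted(tuple(sorted(link)) for link in (link_a, link_b))
    if j1 < i2:
        return "s"
    if j2 < j1:
        return "p"
    return "x"


def state_matrix(links):
    """Matrix A of pair states; diagonal entries are None."""
    size = len(links)
    return [[pair_state(links[a], links[b]) if a != b else None
             for b in range(size)]
            for a in range(size)]


def hamming(A1, A2, assignment):
    """D(A1, A2) for a fixed assignment l -> k(l): number of mismatched link pairs."""
    size = len(A1)
    return sum(A1[l][m] != A2[assignment[l]][assignment[m]]
               for l in range(size) for m in range(l + 1, size))


def distance_iii(links1, links2):
    """Brute-force D_III: minimize the Hamming distance over all M! assignments."""
    A1, A2 = state_matrix(links1), state_matrix(links2)
    return min(hamming(A1, A2, k) for k in permutations(range(len(links1))))


if __name__ == "__main__":
    # Same modules in a different order: a p2 module and a single link in series.
    print(distance_iii([(1, 2), (3, 6), (4, 5)], [(1, 4), (2, 3), (5, 6)]))
    # Two configurations related by one LC-II (endpoints 2 and 3 swapped).
    print(distance_iii([(1, 3), (2, 4)], [(1, 2), (3, 4)]))
```

The factorial cost limits this check to a handful of links, which is the motivation for an approximate assignment algorithm at larger $M$.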


We consider the probability measure $\mu(\mathbf{k}) \propto e^{-\beta D(A^1, A^2)}$ of
assignment $\mathbf{k}$, where $D(A^1, A^2)$ plays the role of an energy function, and $\beta$
is an inverse temperature controlling the typical Hamming distances. Then, we use the Bethe
approximation to compute the local probability marginals $\mu_l(k)$ of assigning $l \to k$. In
the Bethe approximation, the local marginals are written in terms of the cavity probability
marginals $\mu_{l' \to l}(k')$ of assigning $l' \to k'$ in the absence of link $l$. For a
moment, suppose the interaction (dependency) graph of the variables, defined by the energy
function, is a tree. Then, the cavity marginal $\mu_{l' \to l}(k')$ can be written in terms of
the other cavity marginals received from the neighboring variable nodes except $l$. The
recursive equations governing these cavity marginals are called the belief propagation (BP)
equations [12, 2^]. For arbitrary interaction graphs, the BP equations can still be used to
find good estimates of the local probability marginals. The quality of this approximation then
depends on the structure and strength of the interactions. In our model, the approximated BP
equations for the cavity marginals are given by

$\mu_{l' \to l}(k') \propto \prod_{l'' \neq l, l'} \left( \sum_{k'' \neq k'} e^{-\beta (1 - \delta_{A^1_{l',l''}, A^2_{k',k''}})} \mu_{l'' \to l'}(k'') \right)$.   (2)

We obtain the local marginals $\mu_l(k)$ in the same way, but considering all the incoming
cavity marginals $\mu_{l' \to l}(k')$. The limit $\beta \to \infty$ of the local marginals
would be enough to find an approximate solution by a decimation algorithm, as explained in
Appendix A.

We used the above algorithm to find an approximate assignment minimizing the Hamming
distance of two randomly generated link configurations. For $M = 10, 20, 30, 40$ links we
find the following estimates of the minimum distances: $2 D_{III} / (M(M-1)) = 0.371 \pm
0.008$, $0.316 \pm 0.006$, $0.294 \pm 0.007$, $0.289 \pm 0.006$, respectively.


IV. MINIMUM EVOLUTION PROBLEM 

Consider an evolution path of length $T$ starting from link configuration $L_0$, ending up at
configuration $L_T$, and connecting neighboring link configurations. Define
$\mathcal{A}_{L,L'} = 1$ for two neighboring link configurations related by a local change,
otherwise $\mathcal{A}_{L,L'} = 0$. The aim is to find an optimal evolution minimizing an
energy functional of the path, $\mathcal{E}[L_1, \dots, L_{T-1}]$, depending on the
intermediate link configurations. To this end, we need to sample the space of possible pathways
with the following dynamical partition function:

$Z(L_0 \to L_T) = \sum_{L_1, L_2, \dots, L_{T-1}} \mathcal{A}_{L_T, L_{T-1}} \cdots \mathcal{A}_{L_{t+1}, L_t} \cdots \mathcal{A}_{L_1, L_0} \, e^{-\beta \mathcal{E}[L_1, \dots, L_{T-1}]}$.   (3)

Each path has the statistical weight $e^{-\beta \mathcal{E}}$ depending on the path energy and
the inverse temperature parameter $\beta$. At the end, we will need to take the limit
$\beta \to \infty$ to focus on the optimal pathways.

We assume that the energy functional can be written as
$\mathcal{E} = \sum_{t=1}^{T} [E(t) + E(t-1, t)]$, where $E(t)$ depends on the link
configuration at time step $t$, and $E(t-1, t)$ is a function of the transformation from step
$t-1$ to $t$. More specifically, we will take $E(t) = -N_p(t)$, and
$E(t-1, t) = -\sum_{q \neq q'} \lambda_{q \to q'} N_{q \to q'}(t-1, t)$ for
$q, q' \in \{p, s, x\}$. Here $N_{q \to q'}$ is the number of contact pairs changed from $q$ to
$q'$. Simple folding models show that contact configurations with larger $N_p$ exhibit smaller
folding times [2]. That is why we choose an energy functional that decreases with the number of
parallel contact pairs.
The following study can also be done with more general energy functions, for instance,
$E(t) = \sum_r \lambda(r) M(r; t) + \sum_{q, d} \lambda_q(d) N_q(d; t)$. Here $M(r; t)$ is the
number of links of length $r = |j_l - i_l|$ at time step $t$, and $N_q(d; t)$ is the number of
contact pairs of type $q = p, s, x$ having distance $d = |i_l - i_{l'}|$. We observed that such
energy functions could be useful in the study of complex link configurations in the presence of
structural modules and sectors [25].


For now, we consider simple paths with no loops, that is, each configuration in the path
is visited only once. Figure 4 displays the optimal evolutions (obtained by an exhaustive
search algorithm) connecting the x6 and p3x3 configurations by the local changes of type
II, for a small number of links ($M = 6$). In addition, we report the degeneracy $g$ (the number
of optimal paths) and the energy gap $\Delta$ between the optimal and the next optimal paths.
These quantities provide a measure of the stability of the optimal path. As the figure shows,
for $M = 6$ links and $T = 12$, the optimal path from x6 to p3x3 maximizing
$\mathcal{N}_p = \sum_t N_p(t)$ has degeneracy $g = 1008$ and energy gap $\Delta = 31$.
Increasing the evolution time to $T = 14$ results in $(g = 2688, \Delta = 1)$. In addition, we
observe that the optimal paths maximizing the number of transitions
$\mathcal{N}_{x \to p,s} = \sum_{t=1}^{T} [N_{x \to p}(t-1, t) + N_{x \to s}(t-1, t)]$ have a
very large degeneracy compared to those that maximize the total number of parallel contact
pairs. In Appendix D, we give other examples of evolution with the other local changes, also
connecting link configurations with different numbers of links.
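An exhaustive search of this kind can be sketched in a few lines for very small systems. The code below is an illustration under stated assumptions rather than the authors' implementation: configurations are encoded as words listing the link label of each endpoint, only LC-II moves are used, paths are simple, and the score counts $N_p$ over the intermediate configurations $t = 1, \dots, T-1$ (the boundary terms are constants for fixed endpoints). The names `count_parallel` and `optimal_paths` are hypothetical.

```python
def canonical(word):
    """Relabel links 0, 1, 2, ... in order of their first endpoint."""
    relabel = {}
    return tuple(relabel.setdefault(label, len(relabel)) for label in word)


def neighbors(word):
    """Configurations one LC-II away from the given word."""
    found = set()
    for k in range(len(word) - 1):
        if word[k] != word[k + 1]:
            swapped = list(word)
            swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
            found.add(canonical(swapped))
    return found


def count_parallel(word):
    """N_p: number of link pairs with one link nested inside the other."""
    first, last = {}, {}
    for pos, label in enumerate(word):
        first.setdefault(label, pos)
        last[label] = pos
    return sum(1 for a in first for b in first
               if first[a] < first[b] and last[b] < last[a])


def optimal_paths(start, goal, T):
    """Enumerate simple paths of exactly T steps; return (best N_p sum, degeneracy, gap)."""
    scores = []

    def walk(node, steps, visited, score):
        if steps == T:
            if node == goal:
                scores.append(score)
            return
        for nxt in neighbors(node):
            if nxt not in visited:
                # Only intermediate configurations contribute to the score.
                gain = count_parallel(nxt) if steps + 1 < T else 0
                walk(nxt, steps + 1, visited | {nxt}, score + gain)

    walk(start, 0, frozenset([start]), 0)
    if not scores:
        return None
    best = max(scores)
    lower = [s for s in scores if s < best]
    gap = best - max(lower) if lower else None
    return best, scores.count(best), gap


all_x = (0, 1, 2, 0, 1, 2)  # x3
all_s = (0, 0, 1, 1, 2, 2)  # s3
print(optimal_paths(all_x, all_s, 3))
print(optimal_paths(all_x, all_s, 4))  # wrong parity: no path of this length
```

The number of simple paths grows very quickly with $M$ and $T$, so this enumeration is only practical for a few links and short evolution times.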

Next, we study some statistical properties of the optimal paths in the connectivity graph
of link configurations with $M \le 5$ links. To this end, we take the adjacency graphs obtained
by different choices of the local changes, and compare the number of optimal shortest paths
maximizing $\mathcal{N}_p$, with a given path length $T$, degeneracy $g$, and energy gap
$\Delta$. Table I gives the average value $\langle o \rangle$ and standard deviation
$\sigma_o = \sqrt{\langle o^2 \rangle - \langle o \rangle^2}$ of these quantities, in
addition to values for the correlation coefficients
$r(o, o') = (\langle o o' \rangle - \langle o \rangle \langle o' \rangle) / (\sigma_o \sigma_{o'})$.
Longer optimal paths are expected to have larger degeneracies and smaller gaps; that would
result in positive and negative values for the correlation coefficients $r(T, g)$ and
$r(T, \Delta)$, respectively. As the table shows, this does not always happen, for example, for
local changes of type II+$M^*$. We also observe that the local changes of type II behave
differently from the other local changes, with very strong fluctuations in $g$ and $\Delta$,
along with very small correlation coefficients. In Fig. 5 we also display the probability
distribution of $g$ and $\Delta$ for the optimal shortest paths in the connectivity graphs
$\mathcal{G}_{I+M^\pm}$ and $\mathcal{G}_{II+M^\pm}$. Here, one may prefer LC-(I+$M^\pm$) to
LC-(II+$M^\pm$), as the optimal paths have on average a smaller degeneracy and a larger energy
gap in the former than in the latter case.

FIG. 4. Evolution with local changes of type II: properties of the intermediate link
configurations, including their distances $D_{II}(t)$ from the boundary configurations, in the
paths obtained by the exact algorithm for $M = 6$ links from the all-x (x6) configuration at
$t = 0$ to a modular structure of two components (p3x3) at shortest distance $D_{II} = 12$.
Besides the shortest path (a), we display the optimal paths for $T = 14$ maximizing
$\mathcal{N}_p$ (b), and the path maximizing $\mathcal{N}_{x \to p,s}$ for x6 to p3x3 (c) and
p3x3 to x6 (d). Here $t$ denotes the number of local changes of type II. The path degeneracy
$g$ and energy gap $\Delta$ are: $(g = 1008, \Delta = 31)_a$, $(g = 2688, \Delta = 1)_b$,
$(g = 32904998, \Delta = 14)_c$, $(g = 32409638, \Delta = 2)_d$.

FIG. 5. Probability distribution of the degeneracy $g$, and gap $\Delta$ between the optimal
and the next optimal paths, for shortest paths maximizing $\mathcal{N}_p$ in the connectivity
graph of link configurations with $M \le 5$ links using the LC-(I+$M^\pm$) and LC-(II+$M^\pm$).

              (⟨T⟩, σ_T)       (⟨g⟩, σ_g)        (⟨Δ⟩, σ_Δ)        r(T,g)   r(T,Δ)   r(g,Δ)
LC-I          (3.216, 0.770)   (1.835, 1.483)    (4.192, 4.343)    0.264    -0.09    0.102
LC-II         (6.134, 2.129)   (4.939, 61.762)   (7.937, 10.608)   0.037    0.022    0.041
LC-(I+M±)     (2.760, 0.623)   (1.439, 0.889)    (2.937, 2.209)    0.227    -0.068   -0.096
LC-(II+M±)    (3.418, 0.924)   (1.684, 1.371)    (2.102, 1.339)    0.291    -0.175   -0.08
LC-(I+M*)     (2.374, 0.572)   (1.451, 0.940)    (3.636, 2.844)    0.225    0.021    -0.005
LC-(II+M*)    (2.697, 0.667)   (1.579, 1.134)    (4.416, 3.694)    0.217    0.121    0.021

TABLE I. Statistical properties of the optimal shortest paths maximizing $\mathcal{N}_p$ in the
connectivity graphs of link configurations with $M \le 5$ links obtained by different local
changes: I, II, I+$M^\pm$, II+$M^\pm$, I+$M^*$, II+$M^*$. Here $T$, $g$ and $\Delta$ denote the
path length, the degeneracy of the optimal path, and the energy gap between the optimal and the
next optimal paths, respectively. The data in case LC-II are restricted to paths of length
T < 12 to reduce the computation time. The average and standard deviation of variable $o$ are
denoted by $\langle o \rangle$ and $\sigma_o$. The correlation coefficient of two variables
$o, o'$ is computed by
$r(o, o') = (\langle o o' \rangle - \langle o \rangle \langle o' \rangle) / (\sigma_o \sigma_{o'})$.

Consider two link configurations $L_0, L_T$ connected by a sequence of local changes, that
is $L_T = u_T \cdots u_1 L_0$. There could be different orderings of the LCs connecting the
same boundary configurations. The question is how these different orderings affect a macroscopic
behavior (phenotype) of the chain, for example a monotonically increasing (fitness) function
of the $N_{p,s,x}$. For simplicity, here we focus on the case of two local changes $u, v$ of
type II. We say the local change $u$ commutes with $v$ in the context of $L$ if $vuL = uvL$. In
Table I of Appendix B we summarize the possible effects of two commutative local changes on a
contact configuration. As the table shows, the changes in the numbers $N_{p,s,x}$ in the two
paths connecting $L$ to $L' = vuL = uvL$ are correlated depending on how that quantity changes
from $L$ to $L'$. In particular, when $N_q$ increases (or decreases) the corresponding changes
in the two paths cannot have different signs.

The optimal paths we obtained so far were simple, with no loops (also called off-pathways).
The off-pathways (if allowed) can localize the dynamics in a small region of the configuration
space, wasting the evolution time. To escape from these traps, one may increase the path
length $T$ or somehow disturb the system in the hope of finding another path dominating
the off-pathways, but probably another set of off-pathways would appear. See Table II in
Appendix D for some examples of evolution in the presence of off-pathways.
A. An approximate evolution algorithm


For larger numbers of links, one has to think of other, approximate algorithms. To be
specific, in the following we consider only evolutions with the local changes of type II. We
will need to introduce some auxiliary variables to represent the global constraint of path
connectivity in terms of local constraints amenable to local message-passing algorithms.
Then, we will apply the Bethe approximation to obtain an efficient way of dealing with the
above minimum evolution problem.

Let us label the links from left to right according to the order of their first endpoints. To
shorten the evolution time, we allow for more than one LC-II in each step of the evolution
and represent the transformation from $L_{t-1}$ to $L_t$ by a matching of the links
$\mathbf{u}(t)$; links $l$ and $l'$ are involved in a local change of type II if
$u_{l,l'}(t) = 1$, otherwise $u_{l,l'}(t) = 0$. Moreover, $u_{l,l'}(t)$ could be 1 only if
links $l$ and $l'$ have neighboring endpoints on the chain. The matching property of
$\mathbf{u}(t)$ means that if $u_{l,l'}(t) = 1$ then $u_{l,l''}(t) = u_{l',l''}(t) = 0$ for all
$l'' \neq l, l'$. This property ensures that the order of the local changes in one step is not
important; therefore, we can uniquely determine $e_l(t), e_{l'}(t)$ from
$e_k(t-1), e_{k'}(t-1)$ in case $u_{k,k'}(t) = 1$. If necessary, we also interchange the labels
$(k, k') \to (l = k', l' = k)$ to ensure that in each step of the evolution the links are
ordered according to their first endpoints.

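The statistical properties of the problem can be obtained from the following dynamical
partition function

$Z(L_0 \to L_T) = \sum_{\mathbf{u}(1), \mathbf{u}(2), \dots, \mathbf{u}(T)} e^{-\beta \mathcal{E}} \, \delta_{L_T, L(\mathbf{u}(1), \dots, \mathbf{u}(T) | L_0)}$.   (4)

Note that given the initial configuration $L_0$ and transformations
$\{\mathbf{u}(1), \mathbf{u}(2), \dots, \mathbf{u}(T)\}$, we can uniquely construct the
configuration at step $t$, that is $L_t = L(\mathbf{u}(1), \mathbf{u}(2), \dots, \mathbf{u}(t) | L_0)$.
Here $\delta_{L,L_t} = 1$ if $L = L_t$, otherwise it is zero.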

The above problem can be solved by a dynamic programming algorithm working with the
cavity marginals $\mu_{t \to t+1}(L_t)$ and $\mu_{t \to t-1}(L_t)$. These are the probabilities
of having configuration $L_t$ in the absence of the energy terms and constraints in the other
segment of the pathway, i.e. $(t, T]$ for the forward message $\mu_{t \to t+1}(L_t)$ and
$[0, t)$ for the backward message $\mu_{t \to t-1}(L_t)$. More precisely,

(5)

and similarly for $\mu_{t \to t-1}(L_t)$.

Note that working with an exact representation for the cavity marginals is computationally
very expensive for large problem sizes. Thus, we have to resort to reasonable approximations
working with an efficient and succinct representation of the cavity messages; see
[21-23] for examples in the quantum and dynamic cavity methods. Here we explain the
main approximations used in this study. The reader can find more details in Appendix C.
We represent a link configuration $L$ by the set of endpoints $e_l = (i_l, j_l)$ and
approximate the cavity marginals by a Bethe distribution [21], for instance,

$\mu_{t \to t+1}(L) \propto \prod_l \mu_l(e_l) \prod_{l<l'} \frac{\mu_{l,l'}(e_l, e_{l'})}{\mu_l(e_l) \mu_{l'}(e_{l'})}$.   (6)

Here $\mu_l(e_l)$ and $\mu_{l,l'}(e_l, e_{l'})$ are the one-link and two-link marginals of
$\mu_{t \to t+1}(L)$. Using this form of the cavity messages in the right hand side of Eq. (5),
we employ the Bethe approximation to compute the two-link marginals $\mu_{l,l'}(e_l, e_{l'})$.
To this end, we introduce auxiliary variables $\delta_l$ that allow us to know how the local
changes affect link $l$. More precisely, given $e_l(t)$ and $\delta_l$ we will be able to
recover $e_k(t-1)$ in the previous time step. Here $\delta_l$ takes a small number of values,
as the number of possible local changes is small; the endpoints and the label of a link can
change by at most $\pm 1$ in a LC-II.

Given the cavity marginals $\mu_{t \to t+1}(L_t)$ and $\mu_{t \to t-1}(L_t)$, we obtain the
local marginals $\mu_t(e_l, e_{l'})$ from the distribution $\mu_t(L_t)$ that combines the
forward and backward messages. This allows us to find an estimate of the number of
possible link configurations at time step $t$, as explained in Appendix C. Figure 6 displays
the information obtained in this way for different inverse temperatures $\beta$ with $M = 10$
links. Finally, to concentrate on the minimum evolutions, we take the limit
$\beta \to \infty$ of the above equations to obtain the so-called minsum equations [24], see
Appendix C. These minsum equations are used in a reinforcement (smoothed decimation) algorithm
[27] to find an approximate optimal path for given boundary conditions and time steps $T$.

FIG. 6. The entropy $S(t)$ in the intermediate steps of evolution from a random link
configuration at $t = 0$ to another random configuration at $t = 10$ for $M = 10$ links. The
figure also shows $S_0(t)$, the entropy of configurations at distance $t$ from the initial
configuration, and $\langle N_p(t) \rangle$, the average of $N_p$ at time step $t$. The data
have been obtained by the finite-temperature evolution algorithm with energy function
$\mathcal{E} = -\sum_t N_p(t)$.

By the above approximate algorithm, we can explore the configuration space of larger
numbers of links with larger evolution times. The time complexity of the algorithm grows like
TM^, and it still takes a few hours of CPU time to find an approximate minimum path of
length $T = 10$ for $M = 10$. There are a few points to mention here about the algorithm. In
each step we are approximating the cavity marginals $\mu_{t \to t \pm 1}(L_t)$ by a Bethe
distribution, which would degrade the algorithm performance as the evolution time $T$
increases. But we are working with coarse-grained time steps allowing for $O(M)$ local changes
of type II in each step. This is good because the average shortest distance between two
randomly selected link configurations would grow as $M \ln M$ if the connectivity graph is
close to a random graph of $e^{M \ln M}$ nodes. This means that we do not need to work with a
very large number of coarse-grained time steps. This of course is obtained at the expense of
optimizing a coarse-grained dynamics instead of the more detailed one.
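The $e^{M \ln M}$ estimate for the number of configurations follows from the count $(2M-1)!!$ of perfect matchings given in Section II. As a short worked check, using only Stirling's approximation $\ln n! \approx n \ln n - n$:

```latex
\begin{align}
\ln (2M-1)!! &= \ln \frac{(2M)!}{2^M \, M!} \\
 &\approx \bigl(2M \ln 2M - 2M\bigr) - M \ln 2 - \bigl(M \ln M - M\bigr) \\
 &= M \ln M + M (\ln 2 - 1) .
\end{align}
```

To leading order the number of link configurations therefore grows as $e^{M \ln M}$, and its logarithm, which sets the scale of typical distances in a random graph of that size, grows as $M \ln M$. With $O(M)$ local changes per coarse-grained step, this suggests that a number of steps growing only slowly with $M$ can suffice, in line with the statement above.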

As an example, we consider the evolution of $M = 6$ links from the all-x configuration x6
to a modular configuration of two components, p3x3. The shortest path has length $T = 12$
with $\mathcal{N}_p = 30$. For an optimal path of length $T = 16$ maximizing $\mathcal{N}_p$,
we obtain $\mathcal{N}_p = 62$ by the exact algorithm. On the other hand, using the approximate
algorithm, we obtain a path of 7 coarse-grained steps, shown in Fig. 7, which can be decomposed
into $T = 19$ local changes of type II with $\mathcal{N}_p = 54$. In the same figure, we show a
path from x6 to x3x3. Here the shortest path takes $T = 9$ steps and $\mathcal{N}_p = 0$. For
$T = 13$ we find $\mathcal{N}_p = 22$ by the exact algorithm, whereas the approximate algorithm
gives $\mathcal{N}_p = 28$ in 7 coarse-grained steps consisting of 15 local changes of type II.
Note, however, that these are not fair comparisons, as the two algorithms are not optimizing
exactly the same dynamics. In Appendix D, we display more instances of evolutions obtained by
the approximate algorithm for a larger system with $M = 10$ links and $T = 9$.
FIG. 7. An evolution path of M = 6 links for T = 7 coarse-grained steps from the all-x configuration
to modular configurations of two components p3x3 and x3x3 obtained by the approximate minimum
evolution algorithm minimizing $\mathcal{E} = -\sum_t N_p(t)$. The endpoints on the chain start from i = 0
(at the bottom of the circle) and increase to i = 2M − 1 in the counter-clockwise direction.

V. DISCUSSION 

Conformational transitions are ubiquitous in biomolecular systems, have significant
functional roles and are subject to evolutionary pressures. These transitions change properties of
molecules including topology, surface properties and mechanical flexibility, and subsequently
affect their affinity for interacting partners as well as their functions. Despite the existence of
experimental data regarding the long-lived stable states of biomolecules, little or no experimental
data are often available on the intermediate states along the conformational transition
pathway associated with a function or the evolution of a function. Theoretical and computational
efforts are providing fundamental insights into this process. Here we provided a first
theoretical framework for topological transition, i.e. conformational transitions that are associated
with changes in molecular topology.

We defined distance measures on the space of circuit topologies and then used the distance
measures to study how topologies evolve under the most basic protocols. Topology
of a chain can change in two ways, by changing the number of contacts or by rearranging 
the contacts j^. Here we studied correlated disruption of two neighboring contacts and 
subsequent reformation of contacts via exchanging neighboring contact sites. This protocol 
is for example applicable to rearrangements due to co-variation of monomers that are close 
in primary sequence. It also assumes that two neighboring contacts have a higher chance of 
exchanging their partners due to physical proximity (closeness in sequence commonly leads 
to proximity in 3D). We also studied the dynamics under a protocol in which two random
contacts are disrupted and the free contact sites form a new contact pair. In addition, we
considered the dynamics with a variable number of links, where a contact configuration can
also change by creation or annihilation of a single contact. We defined connectivity graphs
and distance measures based on these local changes in the space of contact configurations,
and studied their statistical properties.
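The neighbor-exchange protocol and the distance it induces can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation used for the results: a configuration is written as a tuple giving, for each endpoint position along the chain, the label of the link attached there; links are treated as unlabeled; and a local change of type II is taken to be the swap of two adjacent endpoints that belong to different links. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def canonical(config):
    """Relabel links by order of first appearance (links treated as unlabeled)."""
    labels = {}
    return tuple(labels.setdefault(link, len(labels)) for link in config)

def pair_counts(config):
    """Count link pairs in parallel (p), series (s) and cross (x) arrangement."""
    ends = {}
    for pos, link in enumerate(config):
        ends.setdefault(link, []).append(pos)
    counts = {"p": 0, "s": 0, "x": 0}
    for a, b in combinations(sorted(ends.values()), 2):
        # a starts first; b lies after a (s), inside a (p), or interleaves with a (x)
        if a[1] < b[0]:
            counts["s"] += 1
        elif b[1] < a[1]:
            counts["p"] += 1
        else:
            counts["x"] += 1
    return counts

def lc2_neighbors(config):
    """Patterns reachable by swapping two adjacent endpoints of different links."""
    result = set()
    for i in range(len(config) - 1):
        if config[i] != config[i + 1]:
            swapped = list(config)
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            result.add(canonical(swapped))
    return result

def shortest_distance(start, goal):
    """Breadth-first search for the minimum number of swaps between two patterns."""
    start, goal = canonical(start), canonical(goal)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return dist[current]
        for nxt in lc2_neighbors(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None  # goal not reachable (e.g. different number of links)
```

For example, `shortest_distance((0, 0, 1, 1), (0, 1, 1, 0))` connects two links in series to two links in parallel through the intermediate cross arrangement. The breadth-first search is exhaustive, so it is only practical for a small number of links.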

Our analysis revealed properties of the topological space and their implications for
topological dynamics. Using the exact algorithm for a small number of links, we find: (1) The
connectivity graph of link configurations that are related by the local changes of type II and
shows that configurations are more concentrated around the modular configurations.
(2) The optimal shortest paths minimizing the energy functional $\mathcal{E} = -\mathcal{N}_p$ behave
very differently in the connectivity graphs obtained by the LC-II; we observed very large
fluctuations in the degeneracy g and energy gap Δ, accompanied by nearly no correlation
between these quantities and the path length. As another example, the optimal shortest paths
in exhibit on average smaller degeneracy and larger gap than the optimal shortest
paths in $\mathcal{G}_{II+M_\pm}$.


For a larger number of links, we devised an approximate algorithm to estimate the number
of intermediate link configurations connecting two boundary configurations in a path of
length T. A zero-temperature limit of the algorithm allows us to find an approximate
optimal path minimizing an additive energy functional of the evolution. The challenge
here is the study of larger problem sizes by more accurate and efficient approximations, to
investigate large deviations (rare events) in the energy landscape of the evolution.


The simplicity of our model allowed us to illustrate the notion of topological dynamics 
without the need to consider the complexity of a real system. Despite being a critical 
determinant of polymer dynamics [2, |2^, topology is often not the only factor governing 
molecular functions and dynamics. Many other factors such as chemical nature of the 
monomers, geometric constraints, solvent properties as well as interacting molecules play 
critical roles. Further experimental, bioinformatic and theoretical studies are needed to test 
the relevance, applicability and predictability of the proposed distance measures and modes 
of dynamics in real world applications. 
VI. CONCLUSION


In this article, we explored how topology imposes constraints on chain dynamics and 
shapes the conformational search towards a desired state. The topology framework can 
be used for a host of applications in structural biology, evolutionary biology and materials 
sciences, among others. We conclude by suggesting possible directions for future research. 

The topology framework is expected to find applications in analysis of genome architecture
and dynamics. In particular, our approach can be readily applied to study the
topologically associated domains (TADs) in chromosomes [29-31]. Here, DNA acts as a
polymer chain and binding proteins act as contacts bringing distinct sites on DNA together.
Despite the complexity of genomic architecture and nuclear environment, loci that are distant
are often able to come into close proximity to form physical contacts [3^ that subsequently
lead to biochemical interactions and signaling [37]. Contacts between genomic loci may
break and re-establish into a new arrangement. The arrangement of these contacts and
the frequencies of contact formation between sites on the same or different chromosomes
can be measured using Chromosome Conformation Capture (3C) technology and its more
recent variants such as Hi-C techniques [32-35]. Various quantities analyzed in this article
could assume a different meaning when put in this context: gap between, and degeneracy
of, states would be related to the stability and frequency of 3D conformations. These could
then directly inform biological experiments [30]. Dynamical changes between topological
states could be interpreted as reactions catalyzed by histone modifications in the case of
interphase chromosomes.

Our approach can also be used to study dynamics and evolution of proteins. To do so, one
first has to extract circuit topology from coordinate files, as described in [1]. One can then
infer energy functions that serve to describe the statistical weight of contact configurations
in a dynamical process. In Ref. (2^ we used local topological data to reconstruct such an
energy function. There, we observed that appropriate forms of two-contact interactions are
enough to describe simple structural orders like modules and sectors.

Characterizing the nature of large deviations of stochastic quantities (e.g. work) in a
nonequilibrium process (e.g. stretching a polymer in contact with a thermal bath) is essential
to our understanding of biophysical systems and, in general, the principles of nonequilibrium
systems [39-43]. Here one needs powerful and efficient sampling algorithms to capture the
statistical properties of the rare events. The cavity method and algorithms we utilized in
this work proved useful in the study of equilibrium glassy systems, and we expect them to be
helpful also in biophysical applications.

Finally, we envision application of our approach to areas other than polymer physics. In 
particular, our study may be applicable to problems involving optimal evolution of dynamical 
(stochastic) systems. Linearly ordered objects with inter-object interactions are ubiquitous 
in nature and in applications ranging from economics and operations research to biology. 
Understanding the linear order and the arrangements of inter-object interactions are crucial 
for understanding the functions and dynamics of these systems. The former issue, known as 
linear ordering problem or job shop scheduling, has been intensely studied over the last few 
decades. The latter, however, remained less understood. Here, we studied polymer chains as 
a prototype of linear chain of objects with intra-chain interactions in its topological space. 
We hope that our results will further stimulate research along these lines. 
Appendix A: Details of the minimum assignment algorithm


Consider two M × M matrices $A^1$ and $A^2$ with elements $A^{1,2}_{l,l'} \in \{p, s, x\}$ defining the
relative position of links l and l'. The aim is to find a one-to-one assignment k(l) of the
links, l → k, minimizing the Hamming distance $D(A^1, A^2) = \sum_{l<l'} (1 - \delta_{A^1_{l,l'}, A^2_{k,k'}})$. Here,
to shorten the notation, we use k, k' for k(l), k(l').

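We start from the local marginals $\mu_l(k)$ of the probability measure $\mu(\mathbf{k}) \propto e^{-\beta D(A^1, A^2)}$ of
the assignments written in the Bethe approximation for a finite $\beta$ [12],

$$\mu_l(k) \propto \prod_{l' \neq l} \Big( \sum_{k' \neq k} e^{-\beta (1 - \delta_{A^1_{l,l'}, A^2_{k,k'}})} \, \mu_{l' \to l}(k') \Big). \tag{A1}$$

The cavity marginals $\mu_{l' \to l}(k')$ give the probability of assigning l' → k' in the absence of link
l. The recursive equations governing these cavity marginals are called the belief propagation
equations [12, 24],

$$\mu_{l' \to l}(k') \propto \prod_{l'' \neq l, l'} \Big( \sum_{k'' \neq k'} e^{-\beta (1 - \delta_{A^1_{l',l''}, A^2_{k',k''}})} \, \mu_{l'' \to l'}(k'') \Big). \tag{A2}$$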

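We can solve the equations by iteration starting from a random initial condition.

But we are interested in the limit $\beta \to \infty$ of the equations, concentrating on the optimal
assignments minimizing $D(A^1, A^2)$. Assuming the scaling $\mu_{l \to l'}(k) = e^{-\beta h_{l \to l'}(k)}$ for the cavity
marginals, the limit $\beta \to \infty$ of the BP equations reads

$$h_{l' \to l}(k') = \sum_{l'' \neq l, l'} \min_{k'' \neq k'} \Big\{ (1 - \delta_{A^1_{l',l''}, A^2_{k',k''}}) + h_{l'' \to l'}(k'') \Big\} - C_{l' \to l}. \tag{A3}$$

Here $C_{l' \to l}$ is a constant to make $\min_{k'} h_{l' \to l}(k') = 0$. These equations are called minsum
equations [24].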


We use the above equations in a reinforcement algorithm to fix smoothly the assignment
variables [27]. To this end, we use the information in the local marginals $\mu_l(k) = e^{-\beta h_l(k)}$ to
increase slowly an external field acting on the variables. The aim is to concentrate more and
more the cavity and local marginals on a minimum assignment as the algorithm proceeds.
More precisely, we start from random initial messages $h^0_l(k)$, $h^0_{l \to l'}(k)$, and in each step we
update the messages in the following way:

$$h^{t+1}_{l \to l'}(k) = \eta_l(k) + r(t) h^t_l(k) + \sum_{l'' \neq l, l'} \min_{k'' \neq k} \Big\{ (1 - \delta_{A^1_{l,l''}, A^2_{k,k''}}) + h^t_{l'' \to l}(k'') \Big\} - C_{l \to l'}. \tag{A4}$$

In the same way, we update the local messages

$$h^{t+1}_l(k) = \eta_l(k) + r(t) h^t_l(k) + \sum_{l' \neq l} \min_{k' \neq k} \Big\{ (1 - \delta_{A^1_{l,l'}, A^2_{k,k'}}) + h^t_{l' \to l}(k') \Big\} - C_l. \tag{A5}$$

Here r(t) is the reinforcement parameter; it is zero at the beginning and increases slowly
with time as $r(t + 1) = r(t) + \delta r$, for a small $\delta r \simeq 0.01$. In addition, we introduced a small
noise $\eta_l(k)$ to the equations to reduce the number of possible minimum assignments. In each
iteration one updates all the local and cavity messages, selected in a random sequential way,
according to the above equations. In the end, one obtains an assignment by looking at the
local messages; that is $l \to k = \arg\min_k h_l(k)$.
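For a handful of links the minimum assignment can be found by brute force, which gives a reference value against which a message-passing result could be compared. The sketch below is not the reinforcement algorithm described above; it simply enumerates all M! assignments. It assumes that links carry the labels 0..M-1 and that the relation matrix is built from endpoint positions with the usual circuit-topology rules (disjoint intervals are in series, nested intervals are parallel, interleaved intervals cross). The function names are illustrative.

```python
from itertools import combinations, permutations

def relation_matrix(config):
    """Matrix A[l][l'] in {'p', 's', 'x'} for links labelled 0..M-1 in config."""
    ends = {}
    for pos, link in enumerate(config):
        ends.setdefault(link, []).append(pos)
    M = len(ends)
    A = [[None] * M for _ in range(M)]
    for l, lp in combinations(range(M), 2):
        a, b = sorted([ends[l], ends[lp]])  # a is the link that starts first
        rel = "s" if a[1] < b[0] else ("p" if b[1] < a[1] else "x")
        A[l][lp] = A[lp][l] = rel
    return A

def hamming(A1, A2, k):
    """Number of link pairs whose relation differs under the assignment l -> k[l]."""
    M = len(A1)
    return sum(A1[l][lp] != A2[k[l]][k[lp]] for l, lp in combinations(range(M), 2))

def minimum_assignment(A1, A2):
    """Exhaustive search over all one-to-one assignments (feasible for small M only)."""
    M = len(A1)
    best = min(permutations(range(M)), key=lambda k: hamming(A1, A2, k))
    return best, hamming(A1, A2, best)
```

The exhaustive search scales as M!, which is the reason an approximate message-passing scheme is needed for larger systems.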

Appendix B: Ordering statistics of the local changes 

Consider two link configurations $L_0$, $L_T$ connected by a sequence of local changes, that is
$L_T = u_T \cdots u_1 L_0$. To be specific, we assume the u's are local changes (LC) of type II, where
u is an elementary permutation of the neighboring endpoints $(i_u, i_u + 1)$. There could be
different orderings of the LCs connecting the same boundary configurations. The question
is how these different orderings affect a macroscopic behavior (phenotype) of the chain,
for example a monotonically increasing (fitness) function of the $N_{p,s,x}$.

For simplicity, let us ignore the link labels and work with the connectivity patterns of
the endpoints C. We will also focus on the simple case of two local changes u, v of type
II. The local change u commutes with v in the context of C if vuC = uvC. Note that the
transformations are reversible, that is from C' = vuC we obtain C = uvC'. The definitions
can readily be extended to link configurations with distinguishable links as well.
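The commutation condition vuC = uvC can be tested directly on small patterns. The sketch below assumes that a connectivity pattern is stored as a tuple of link labels along the chain, that a local change of type II swaps the endpoints at positions i and i+1, and that patterns are compared after relabeling links by order of first appearance. The helper names are hypothetical.

```python
def canonical(config):
    """Relabel links by order of first appearance (ignore link labels)."""
    labels = {}
    return tuple(labels.setdefault(link, len(labels)) for link in config)

def apply_lc2(config, i):
    """Swap the endpoints at positions i and i+1 (a local change of type II)."""
    if config[i] == config[i + 1]:
        raise ValueError("both endpoints belong to the same link; the swap has no effect")
    swapped = list(config)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return tuple(swapped)

def commute(config, i, j):
    """True if the swaps at i and j lead to the same pattern in either order."""
    try:
        uv = apply_lc2(apply_lc2(config, j), i)
        vu = apply_lc2(apply_lc2(config, i), j)
    except ValueError:
        return False  # one of the orderings contains a trivial swap
    return canonical(uv) == canonical(vu)
```

With three mutually crossing links, `C = (0, 1, 2, 0, 1, 2)`, the swaps at positions 0 and 4 act on disjoint endpoints and commute, whereas the overlapping swaps at positions 0 and 1 do not.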

The two local changes u, v may involve two, three, or four distinct links; we will not
consider permutation of neighboring endpoints that belong to a single link, because it has
no effect. In Table II we summarize the possible effects of two commutative local changes
on a contact configuration, considering only the nontrivial case of three links. One can
easily construct the other cases with two or four links, following the above rules. Note that
each transformation in the table can also happen in the reverse direction. As the table
shows, the changes in numbers $N_{p,s,x}$ in the two paths are correlated depending on how
that quantity changes from C to C'. In particular, when $N_q$ increases (or decreases) the
corresponding changes in the two paths do not have different signs; a LC-II only changes
Transformation                         N_p      N_s      N_x

p^3   → (p^2x, p^2x) → px^2            (↓,↓)    (-,-)    (↑,↑)
s^3   → (s^2x, s^2x) → sx^2            (-,-)    (↓,↓)    (↑,↑)
x^3   → (px^2, sx^2) → psx             (↑,-)    (-,↑)    (↓,↓)
x^3   → (px^2, px^2) → p^2x            (↑,↑)    (-,-)    (↓,↓)
ps^2  → (psx, s^2x)  → sx^2            (-,↓)    (↓,-)    (↑,↑)
px^2  → (p^2x, psx)  → p^2s            (↑,-)    (-,↑)    (↓,↓)
px^2  → (psx, x^3)   → sx^2            (-,↓)    (↑,-)    (↓,↑)
px^2  → (p^2x, x^3)  → px^2            (↑,↓)    (-,-)    (↓,↑)
p^2s  → (psx, psx)   → sx^2            (↓,↓)    (-,-)    (↑,↑)
p^2s  → (psx, p^2x)  → px^2            (↓,-)    (-,↓)    (↑,↑)
p^2x  → (px^2, p^2s) → psx             (↓,-)    (-,↑)    (↑,↓)
p^2x  → (px^2, p^3)  → p^2x            (↓,↑)    (-,-)    (↑,↓)
s^2x  → (ps^2, sx^2) → psx             (↑,-)    (-,↓)    (↓,↑)
s^2x  → (s^3, sx^2)  → s^2x            (-,-)    (↑,↓)    (↓,↑)
psx   → (sx^2, p^2s) → psx             (↓,↑)    (-,-)    (↑,↓)

TABLE II. The set of distinct transformations C → (uC, vC) → C' = uvC = vuC obtained by
two commutative local changes of type II applied on three links. The arrows in each column show
the change in the numbers $N_{p,s,x}$: positive (↑) or negative (↓). The first (second) arrow in the
parenthesis corresponds to the first (second) transition. Here $p^{N_p} s^{N_s} x^{N_x}$ shows a configuration of
$N_q$ contact pairs of type q = p, s, x.


the state of two links from x to (p,s), or from (p,s) to x. This means that in a LC-II we
have $\delta N_{p,s} = -\delta N_x = \pm 1$, and two LC-II can at most change $N_{p,s,x}$ by two. Consequently, if
$N_q$ increases (decreases), the changes $\delta N_q$ resulting from the two LC-II cannot have different
signs, because they cannot give the expected total variation in $N_q$.
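The single-pair rule stated above can be verified exhaustively for small systems. The following sketch assumes the same representation as before (a tuple of link labels along the chain, with a type II local change swapping two adjacent endpoints of different links) and enumerates every pattern and every allowed swap; the function names are illustrative.

```python
from itertools import combinations

def all_patterns(M):
    """Generate every pairing of 2M endpoints into M links, labelled by first appearance."""
    config = [None] * (2 * M)
    def fill(next_label):
        if next_label == M:
            yield tuple(config)
            return
        i = config.index(None)  # first free endpoint gets the next label
        for j in range(i + 1, 2 * M):
            if config[j] is None:
                config[i] = config[j] = next_label
                yield from fill(next_label + 1)
                config[i] = config[j] = None
    yield from fill(0)

def pair_counts(config):
    """Count link pairs in parallel (p), series (s) and cross (x) arrangement."""
    ends = {}
    for pos, link in enumerate(config):
        ends.setdefault(link, []).append(pos)
    counts = {"p": 0, "s": 0, "x": 0}
    for a, b in combinations(sorted(ends.values()), 2):
        if a[1] < b[0]:
            counts["s"] += 1
        elif b[1] < a[1]:
            counts["p"] += 1
        else:
            counts["x"] += 1
    return counts

def count_violations(M):
    """Number of type II swaps that break delta N_{p,s} = -delta N_x = +/-1."""
    violations = 0
    for config in all_patterns(M):
        before = pair_counts(config)
        for i in range(2 * M - 1):
            if config[i] == config[i + 1]:
                continue  # trivial swap within one link
            swapped = list(config)
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            after = pair_counts(swapped)
            d = {q: after[q] - before[q] for q in "psx"}
            ok = (abs(d["x"]) == 1
                  and d["p"] + d["s"] == -d["x"]
                  and sorted((abs(d["p"]), abs(d["s"]))) == [0, 1])
            violations += not ok
    return violations
```

The check relies on the fact that swapping two adjacent endpoints changes only the relative order of the two links involved, so only that one pair can change its type.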

Appendix C: Details of the minimum evolution algorithm 

Let us start from the dynamical partition function

$$Z(L_0 \to L_T) = \sum_{u(1), u(2), \ldots, u(T)} e^{-\beta \mathcal{E}} \, \delta_{L(u(1), u(2), \ldots, u(T) | L_0), L_T}, \tag{C1}$$

where $L^t = L(u(1), u(2), \ldots, u(t) | L_0)$ is the link configuration at time step t, and u(t) defines
the position of the possible local changes. In the following, we assume the local changes are
of type II. Here £ = ~ with E{t) = —Np{t), and E(t — l,t) = 

— J2q<q' ~ 1) We recall that a link conhguration is dehned by 

the endpoints of all the links, and any two links have different endpoints. A local-change
configuration $u(t) = \{u_{ll'}(t) \in \{0, 1\} \mid l < l'\}$ is a matching of neighboring links with
adjacent endpoints along the contact chain.

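We solve the above problem by a dynamic programming (message-passing) algorithm:
Define the cavity messages $\mu_{t \to t+1}(L^t)$ and $\mu_{t \to t-1}(L^t)$ as the probability of having link
configuration $L^t$ in the absence of the energy terms and constraints imposed by the other part
of the system; i.e. the segment (t, T] for the forward message $\mu_{t \to t+1}$ and [0, t) for the
backward message $\mu_{t \to t-1}$. From the above partition function we can easily write the equations
for these cavity marginals

[Eqs. (C2) and (C3): recursions for $\mu_{t \to t+1}(L^t)$ and $\mu_{t \to t-1}(L^t)$, normalized by $Z_{t \to t+1}$ and
$Z_{t \to t-1}$, with sums over the local changes u(t) and u(t+1) weighted by the Boltzmann factor $e^{-\beta E}$.]

Then the total marginal at time step t is given by

[Eq. (C4): the total marginal $\mu_t(L^t)$, normalized by $Z_t$.]

The $Z_{t \to t \pm 1}$ and $Z_t$ are normalization constants.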
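For very small systems the forward and backward messages can be computed exactly, without the Bethe approximation introduced next. The sketch below is an illustration under several assumptions that are not taken from the text: exactly one local change of type II is applied per time step (no coarse-graining), links are unlabeled, the energy is $\mathcal{E} = -\sum_{t=1}^{T} N_p(t)$, and each distinct neighboring pattern counts once. All names are hypothetical.

```python
import math
from itertools import combinations

def canonical(config):
    labels = {}
    return tuple(labels.setdefault(link, len(labels)) for link in config)

def n_parallel(config):
    """Number of link pairs in parallel (nested) arrangement, N_p."""
    ends = {}
    for pos, link in enumerate(config):
        ends.setdefault(link, []).append(pos)
    return sum(1 for a, b in combinations(sorted(ends.values()), 2) if b[1] < a[1])

def neighbors(config):
    """Distinct patterns reachable by one swap of adjacent endpoints of different links."""
    out = set()
    for i in range(len(config) - 1):
        if config[i] != config[i + 1]:
            c = list(config)
            c[i], c[i + 1] = c[i + 1], c[i]
            out.add(canonical(c))
    return out

def path_marginals(start, goal, T, beta):
    """Exact marginal of the configuration at each step t = 0..T of a start-to-goal path."""
    start, goal = canonical(start), canonical(goal)
    weight = lambda c: math.exp(beta * n_parallel(c))  # e^{-beta E(t)}, E(t) = -N_p(t)
    forward = [{start: 1.0}]  # paths from start, energy of steps 1..t included
    for _ in range(T):
        msg = {}
        for c, value in forward[-1].items():
            for n in neighbors(c):
                msg[n] = msg.get(n, 0.0) + value * weight(n)
        forward.append(msg)
    backward = [{goal: 1.0}]  # paths to goal, energy of steps t+1..T included
    for _ in range(T):
        msg = {}
        for n, value in backward[0].items():
            for c in neighbors(n):  # swaps are reversible, so c is a predecessor of n
                msg[c] = msg.get(c, 0.0) + weight(n) * value
        backward.insert(0, msg)
    marginals = []
    for f, b in zip(forward, backward):
        joint = {c: f[c] * b[c] for c in f if c in b}
        Z = sum(joint.values())
        if Z == 0.0:
            raise ValueError("no path of length T connects the boundary configurations")
        marginals.append({c: v / Z for c, v in joint.items()})
    return marginals

def entropy(marginal):
    """Shannon entropy of the single-time marginal."""
    return -sum(p * math.log(p) for p in marginal.values() if p > 0.0)
```

The product of a forward and a backward message at the same time step gives the same normalization Z for every t, which is the partition function of the constrained paths in this simplified setting. The entropy computed here is that of the single-time marginal and vanishes at the two boundaries, where the configuration is fixed.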


1. Approximating the messages


We represent a link configuration L by the set of link endpoints $e_l$ and label the links
according to the order of their first endpoints. We also approximate the cavity messages by
a Bethe distribution

«n n , (C6) 

Using this structure for the cavity messages in the right hand side of the equations, we 
obtain the equations for the two-link marginals e^/), 




,-mt) 

\i{t) 




X 


- 1 )) n 


l),efc/(t- 1)) 


(C6) 


^ - 1 ))' 

We compute the sum in the right hand side of the above equation using the Bethe
approximation [12]. To this end, we introduce auxiliary variables $\delta l$ to see how the local
changes affect link l. More precisely, given $e_l(t)$ and $\delta l$ we can recover the endpoints and
the link label $e_k(t - 1)$ in the previous step. Note that $\delta l$ takes a small number of values as
the number of possible local changes of type II is small; the endpoints and the label of a
link can at most change by ±1. The approximate two-link marginal of $(e_l, e_{l'})$ can be
obtained by considering the constraints involving $(e_l, e_{l'})$, and by taking into account the
effect of the remaining degrees of freedom. The latter is provided by a new set of cavity
marginals $\nu_{l \to l'}(e_l(t); \delta l; u_{ll'}(t))$ giving the probability of the indicated variables in the absence
of l'. Putting all together, we obtain

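$$\mu^{ll'}_{t \to t+1}(e_l(t), e_{l'}(t)) \propto \sum_{\delta l, \delta l', u_{ll'}(t)} w_{ll'}(e_l(t), e_{l'}(t); \delta l, \delta l'; u_{ll'}(t)) \, \nu_{l \to l'}(e_l(t); \delta l; u_{ll'}(t)) \, \nu_{l' \to l}(e_{l'}(t); \delta l'; u_{ll'}(t)), \tag{C7}$$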

where we dehned 

ww{ei{t),ev{t)]6l,6l']Uw{t)) = wu^i{ei(t),eu{t);6l,6l';uw{t))^’l_^^^{ekit - 1)), (C8) 

with 


wi>^i{ei{t), ei>{t)-, 61, 61'] uw(t)) = 1(5/, 61', uu'(t)\ei(t), 

X _ i)|efc(t - 1)). (C9) 


Here $\mathbb{I}(\delta l, \delta l', u_{ll'}(t) | e_l(t), e_{l'}(t))$ is an indicator function to ensure that: (i) $e_l(t) \neq e_{l'}(t)$, (ii)
the links are labeled from left to right according to their first endpoints, and (iii) a local
change is possible given the endpoints and the $\delta l$, $\delta l'$, $u_{ll'}(t)$. Moreover,
$\mu^{kk'}_{t-1 \to t}(e_{k'}(t-1) | e_k(t-1)) = \mu^{kk'}_{t-1 \to t}(e_{k'}(t-1), e_k(t-1)) / \mu^{k}_{t-1 \to t}(e_k(t-1))$ is the conditional
probability of $e_{k'}(t - 1)$ given $e_k(t - 1)$, and $q(e_l, e_{l'}) \in \{p, s, x\}$ depending on the link
endpoints.

The $\nu_{l \to l'}(e_l(t); \delta l; u_{ll'}(t))$ are determined by the following Bethe equations:


ui^i>{ei{t)]6l]0) oc n E wv'^i{ei{t), evi{t)] 61, 61"] Qi)vin_,i{evi{t)] 61"] 0) 

+ E E ei''{t)] 61,61"] l)viii^i{eit'{t)] 61"] 1) 


X n E 61,61'"]0)ui///^i{eiiii(t)] 61'"] 0) j , (CIO) 
and,


1) oc n E 61,61”61''-, 0) j . (Cll) 

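Similarly, we obtain the cavity marginals $\mu^{ll'}_{t \to t-1}(e_l(t), e_{l'}(t))$, and finally the local
marginals read

$$\mu^{ll'}_t(e_l(t), e_{l'}(t)) \propto \nu_{l \to l'}(e_l(t)) \, w_{ll'}(e_l(t), e_{l'}(t)) \, \nu_{l' \to l}(e_{l'}(t)), \tag{C12}$$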


where now 


wu'{ei{t), ev{t)) = (t))n\^^_^{ei{t)), (C13) 


with 




and. 


vi^r{ei{t)) oc n E Wv^^i (e/ (t), ez// (t)) (e/// {t)) 


Given the local marginals ^\{ei{t)) and {ei{t 
time step t can be obtained by the Bethe entropy 


(CIS) 


an estimation of the entropy at 

3 , 


M 


S{t) = 


MlnM 




(C16) 


j<i' 


1=1 


where 


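$$\Delta S_l = -\sum_{e_l} \mu^{l}_t(e_l) \ln \mu^{l}_t(e_l), \tag{C17}$$

$$\Delta S_{ll'} = -\sum_{e_l, e_{l'}} \mu^{ll'}_t(e_l, e_{l'}) \ln \mu^{ll'}_t(e_l, e_{l'}). \tag{C18}$$

In summary, from Eq. (C7) we obtain the cavity marginals $\mu^{ll'}_{t \to t+1}(e_l(t), e_{l'}(t))$, and similarly
for $\mu^{ll'}_{t \to t-1}(e_l(t), e_{l'}(t))$. Then, Eq. (C12) gives the local marginals $\mu^{ll'}_t(e_l(t), e_{l'}(t))$, which
are used in Eq. (C16) to compute the Bethe entropy.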
2. The zero temperature limit $\beta \to \infty$


To take the limit $\beta \to \infty$, we assume the above probability distributions scale as

AL,+i(e<W) = 


(C19) 

(C20) 


and similarly for the messages from t to t − 1. In addition, we define

Now the zero temperature (minsum) equations read [12, 24],


(C21) 

(C22) 






<5q(ei(t),e;/(p),p + l),efc/(f 1)) 

+ gi^v^eiit)] 51] Uivit)) + gi> -^i{e.i'{t)]5l']Uui{t))^, (C23) 

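where the minimum is subject to the constraints in $\mathbb{I}(\delta l, \delta l', u_{ll'}(t) | e_l(t), e_{l'}(t))$, and

$$g_{l \to l'}(e_l(t); \delta l; 0) = \min\Big\{ F^{0}_{l \to l'}(e_l(t), \delta l), \; \min_{l''} F^{l''}_{l \to l'}(e_l(t), \delta l) \Big\}, \tag{C24}$$

$$g_{l \to l'}(e_l(t); \delta l; 1) = F^{0}_{l \to l'}(e_l(t), \delta l). \tag{C25}$$


Here we defined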


F^UAei{t),6l)= fv'^i{ei{t),5l), 






(C26) 

(C27) 

(C28) 


with 


e,;/(p,5(":I,u,j//(t)=0 ' 


+ h^\-,t{ekit - 1), efc»(t - 1)) - - 1)) + gin^i{evt(t)] 51"] 0)|, (C29) 





and 


-,1 \{f^k[t-^),eyi{t-l))^q[ei(t),eii,[t)) Sq{ei{t),ei„{t)),p 

ei//{t),dl":I,Uii//(t)=l I 

+ ht\_^-i-{ek{t - 1), ek"{t - 1)) - ht_i^t{ek{t - 1)) + gv^i^evit)] 61"] 1)|. (C30) 

Similarly we obtain the minsum messages $h^{ll'}_{t \to t-1}(e_l(t), e_{l'}(t))$, and finally the local
messages read


^*'(^)) ~ ^q(ei{t),ei/{t)),p + (c;'(t), 6; (t) ) -|- (t), ei(t)) 

+ 9i^i'{ei{t)) + gp^i{ep{t)), (C31) 


and, 

gi^p{ei{t)) = V min \6^(^e,{t),ep,{t)),p + h“4t+i(ez-(t), e,(t)) 

- M^t+i(ez(t)) + h\^^^_^{ei»{t),ei{t)) - h\^t_^{ei{t)) + gi.^i{ei»{t))y (C32) 

We use the above equations in a reinforcement algorithm to find a minimum evolution
path satisfying all the connectivity constraints. In a reinforcement algorithm, we use the
information in the local messages $h^{ll'}_t(e_l(t), e_{l'}(t))$ to slowly polarize the cavity messages
$h^{ll'}_{t \to t \pm 1}(e_l(t), e_{l'}(t))$ in the direction favored by the local messages, as we did in Appendix A
for the minimum distance algorithm.

In summary, the cavity messages $h^{ll'}_{t \to t+1}(e_l(t), e_{l'}(t))$ are obtained by solving Eq. (C23)
(similarly for $h^{ll'}_{t \to t-1}(e_l(t), e_{l'}(t))$). These messages are used in Eq. (C31) to compute the
local messages $h^{ll'}_t(e_l(t), e_{l'}(t))$, which are utilized in a reinforcement algorithm to find an
approximate optimal pathway.
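In the zero temperature limit the sum over paths is replaced by a minimum. For a small system this minimum can be obtained exactly by dynamic programming over whole configurations, which may serve as a reference for the approximate minsum scheme. The sketch below is not the reinforcement algorithm of this appendix; it assumes one local change of type II per step, unlabeled links, the energy $\mathcal{E} = -\sum_{t=1}^{T} N_p(t)$, and it allows a configuration to be revisited. The names are illustrative.

```python
from itertools import combinations

def canonical(config):
    labels = {}
    return tuple(labels.setdefault(link, len(labels)) for link in config)

def n_parallel(config):
    """Number of link pairs in parallel (nested) arrangement, N_p."""
    ends = {}
    for pos, link in enumerate(config):
        ends.setdefault(link, []).append(pos)
    return sum(1 for a, b in combinations(sorted(ends.values()), 2) if b[1] < a[1])

def neighbors(config):
    """Distinct patterns reachable by one swap of adjacent endpoints of different links."""
    out = set()
    for i in range(len(config) - 1):
        if config[i] != config[i + 1]:
            c = list(config)
            c[i], c[i + 1] = c[i + 1], c[i]
            out.add(canonical(c))
    return out

def min_energy_path(start, goal, T):
    """Return (energy, path) of a minimum-energy path of exactly T steps, or None."""
    start, goal = canonical(start), canonical(goal)
    # best[c] = (minimum energy over t-step paths from start to c, one such path)
    best = {start: (0, [start])}
    for _ in range(T):
        nxt = {}
        for c, (energy, path) in best.items():
            for n in sorted(neighbors(c)):
                candidate = energy - n_parallel(n)  # add E(t) = -N_p(t)
                if n not in nxt or candidate < nxt[n][0]:
                    nxt[n] = (candidate, path + [n])
        best = nxt
    return best.get(goal)
```

For two links, going from the series pattern `(0, 0, 1, 1)` to the parallel pattern `(0, 1, 1, 0)` in T = 4 steps, the minimum is reached by a path that visits the parallel pattern twice. Because revisits are allowed here, such loops appear naturally; restricting the search to simple paths would require tracking the visited configurations.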

Appendix D: More details and figures

In this section, we give more details of the numerical data and figures obtained in this 
study. 

We are interested in topological evolutions connecting two boundary contact configurations
$(L_0, L_T)$ by a sequence of local changes in the link arrangements. More specifically, we
look for optimal pathways of length T minimizing the energy functional $\mathcal{E} = -\sum_t N_p(t) = -\mathcal{N}_p$,
or $\mathcal{E} = -\sum_t [N_{x \to p}(t-1, t) + N_{x \to s}(t-1, t)] = -\mathcal{N}_{x \to p,s}$. Here $\mathcal{N}_p$ is the total number
of contact pairs of type p, and $\mathcal{N}_{x \to p,s}$ gives the total number of contact pairs changed from
x to p, s during the evolution. Figure 8 shows some boundary contact configurations we use
in the following examples. For now, we assume the paths are simple with no loops, that is
each configuration in the path is visited only once.

FIG. 8. Examples of contact configurations of M = 6 links used as boundary configurations in the
minimum evolution algorithm: x6 (top), x3x3 (middle), and p3x3 (bottom).

Figure 9 shows the optimal pathways from the all-x (x6) configuration of M = 6 links
to the modular structure p3x3, following the local changes of type I. The results have been
obtained by an exact algorithm searching in the space of all paths connecting the two
boundary configurations. In Fig. 10 we compare the optimal paths connecting two random
link configurations of M = 5 links with local changes of type II and (II+M±). In the latter
case, two link configurations are connected if they are related either by LC-II or LC-M±.
Figure 11 displays an example of evolution with variable number of links from x4 to p2x2.

So far we have considered simple paths with no loops (loops are also called off-pathways). The
off-pathways can localize the dynamics in a small region of the configuration space, wasting
the evolution time. To escape from these traps, one may increase the path length T in the
hope of finding another path dominating the off-pathways, but probably another set of
off-pathways would appear. Another strategy is to perturb the system, for example, by adding
the transition rates $\mathcal{N}_{x \to p,s}$ to the original energy function $\mathcal{E} = -\mathcal{N}_p$. However,
we observe that in this case the off-pathways are very robust. The reason is that the
off-pathways that maximize $\mathcal{N}_p$ also maximize the number of these transitions, making the
above perturbations ineffective. Table III shows some examples of evolution in the presence
of off-pathways.
FIG. 9. Evolution with local changes of type I: (top) $N_{p,s,x}(t)$, and (bottom) distance $D_I(t)$ of the
intermediate link configurations from the boundary configurations in the paths obtained by the exact
algorithm for M = 6 links. The paths connect the all-x (x6) configuration to a modular structure
of two components (p3x3) at shortest distance $D_I = 5$. Besides the shortest path (a), we display
the path maximizing $\mathcal{N}_p$ (b), and the path maximizing $\mathcal{N}_{x \to p,s}$ for x6 → p3x3 (c) and p3x3 → x6
(d). Here t denotes the number of local changes of type I. The path degeneracy g and energy gap
Δ are: (g = 2, Δ = 2)_a, (g = 8, Δ = 2)_b, (g = 1, Δ = 4)_c, (g = 1, Δ = 4)_d.


 T     N_p    (g, Δ)      N_p^off   (g, Δ)^off   off-pathway

 4     31     (1, 1)      31        (1, 1)       -
 6     58     (3, 1)      58        (3, 1)       -
 8     86     (24, 1)     86        (24, 1)      -
10     113    (48, 1)     114       (24, 1)      → loop
12     142    (48, 1)     143       (24, 1)      → loop → loop
14     169    (372, 1)    172       (120, 1)     → loop → loop → loop

TABLE III. The total number of parallel two-links $\mathcal{N}_p$, degeneracy of the optimal paths g, and
energy gap Δ obtained by an exhaustive search algorithm with local changes of type II. The optimal
paths maximizing $\mathcal{N}_p$ connect two boundary configurations of M = 6 links at shortest distance
$D_{II} = 4$. We compare the cases with and without off-pathways. An off-pathway which is a single
loop appears for the first time at T = 10. By increasing the path length T, we observe that more
loops appear following each other.
FIG. 10. Evolution with local changes of type II (a,b) and (II+M±) (c,d): (top) $N_{p,s,x}(t)$, and
(bottom) distance D(t) of the intermediate link configurations from two random boundary
configurations of M = 5 links. The boundary configurations have shortest distance $D_{II} = 3$, and the
optimal paths are obtained by the exact algorithm maximizing $\mathcal{N}_p$. Besides the optimal shortest
paths (a,c), we display the results for a larger evolution time T = 9 (b,d). Here t denotes the
number of local changes of type II (a,b) or II+M± (c,d). The path degeneracy g and energy gap
Δ are: (g = 1, Δ = 14)_a, (g = 78, Δ = 1)_b, (g = 1, Δ = 2)_c, (g = 64, Δ = 1)_d.


Figure 12 displays the results we obtained by the approximate minimum-evolution algorithm
for paths minimizing $\mathcal{E} = -\sum_t N_p(t)$. In this figure, we show instances of evolution
paths between two random link configurations for a larger number of links M = 10 and steps
T = 9.
FIG. 11. Evolution with local changes of type I+M± (a) and II+M± (b): (top) $N_{p,s,x}(t)$, and
(bottom) distance D(t) of the intermediate link configurations from the boundary configurations for
T = 10 steps. The boundary configurations (x6, p2x2) have different number of links with shortest
distances 4 (a) and 6 (b). The optimal paths are obtained by the exact algorithm maximizing $\mathcal{N}_p$.
Here t denotes the number of local changes of type I+M± (a) or II+M± (b). The path degeneracy
g and energy gap Δ are: (g = 1, Δ = 1)_a, (g = 3, Δ = 1)_b.
FIG. 12. Evolution paths of M = 10 links for T = 9 coarse-grained steps from a random link
configuration to another random configuration obtained by the approximate minimum evolution
algorithm minimizing $\mathcal{E} = -\sum_t N_p(t)$. The endpoints on the chain start from i = 0 (at the
bottom of the circle) and increase to i = 2M − 1 in the counter-clockwise direction.